SUMMARY

In this article we suggest the multivariate kurtosis measure as a statistic
for detection of outliers in a multivariate linear regression model. The
statistic has some locally optimal properties.

Some key words: Multivariate linear regression model. Detection of outliers.
Multivariate kurtosis. Locally best invariant test.

1.  INTRODUCTION 

Several authors have dealt with the problem of detection of outliers in
the linear model; see Cook and Weisberg (1982). However, the corresponding
multivariate problem is difficult and there is not much work in that area. For
an excellent extensive survey of the outlier literature see Barnett and Lewis
(1984). In this paper we give a locally optimum procedure for detection of
outliers based on Mardia's (1970) multivariate sample kurtosis. The result is
based on an extension of Ferguson's (1961) work to the multivariate case, along
the lines of Sinha (1984) and Schwager and Margolin (1982). The idea of using
Ferguson's (1961) work on outlier detection, with suitable modifications for
linear regression problems, was suggested by C.R. Rao. The multivariate problem
is an offshoot of that idea.

2.  NOTATIONS  AND  REDUCTION  OF  THE  PROBLEM 

Consider the multivariate linear regression model

$$Y = XB + E, \qquad \operatorname{rank}(X) = m. \tag{2.1}$$

Assume the rows of $E$ to be independent, each distributed as $N(0, \Sigma)$, i.e.
$\operatorname{Vec}(E) \sim N(0, \Sigma \otimes I_n)$. We write (2.1) in the form

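$$(Y_1 : \cdots : Y_p) = (X\beta_1 : \cdots : X\beta_p) + (\varepsilon_1 : \cdots : \varepsilon_p). \tag{2.2}$$

The BLUE of $\beta_i$ is $\hat\beta_i = (X'X)^{-1}X'Y_i$, $i = 1, 2, \ldots, p$. The residual vectors are

$$e_i = Y_i - X\hat\beta_i, \qquad i = 1, 2, \ldots, p.$$

Thus we have $\hat E = (e_1 : \cdots : e_p)$ and $\operatorname{Vec}(\hat E) \sim N(0, \Sigma \otimes Q)$, where $Q = I - X(X'X)^{-1}X'$.

An unbiased estimate of $\Sigma$ is $S = \hat E'\hat E/(n-m)$.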

Let us denote the $n$ rows of the residual matrix, written as $p \times 1$ vectors, by $e_{(1)}, e_{(2)}, \ldots, e_{(n)}$. If one
or more of the quadratic forms

$$e_{(i)}' S^{-1} e_{(i)}, \qquad i = 1, 2, \ldots, n,$$

are unusually large, then we identify the corresponding observations as outliers.
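
The screening step above can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the author's implementation: the function name `residual_quadratic_forms`, the simulated data and the artificially shifted row are all assumptions made for the example.

```python
import numpy as np

def residual_quadratic_forms(X, Y):
    """Return the residual matrix, S = E'E/(n-m), and d[i] = e_(i)' S^{-1} e_(i)."""
    n, m = X.shape
    # Least-squares fit of all response columns at once: B_hat = (X'X)^{-1} X'Y.
    B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    E_hat = Y - X @ B_hat                 # n x p residual matrix
    S = E_hat.T @ E_hat / (n - m)         # unbiased estimate of Sigma
    # Row-wise quadratic forms e_(i)' S^{-1} e_(i).
    d = np.einsum("ij,ij->i", E_hat @ np.linalg.inv(S), E_hat)
    return E_hat, S, d

# Hypothetical data: n = 30 rows, m = 3 regressors, p = 2 responses.
rng = np.random.default_rng(0)
n, m, p = 30, 3, 2
X = rng.normal(size=(n, m))
Y = X @ rng.normal(size=(m, p)) + rng.normal(size=(n, p))
Y[7] += 8.0                               # artificially shifted row
E_hat, S, d = residual_quadratic_forms(X, Y)
order = np.argsort(d)                     # rows from smallest to largest quadratic form
```

Since the quadratic forms always sum to $(n-m)p$, a single large value is necessarily offset by smaller values elsewhere, which is one reason to look at their ordering rather than at fixed thresholds.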

In the following we adopt the procedure due to Theil (1965) to get uncorrelated
residual vectors, keeping the problem at hand in mind. First, we order the
quadratic forms $e_{(i)}' S^{-1} e_{(i)}$, $i = 1, 2, \ldots, n$, of the residual rows in increasing order of magnitude.
Then we rewrite the model (2.1) starting with the row having the smallest $e_{(i)}' S^{-1} e_{(i)}$
and continuing until the observation vector with the largest $e_{(i)}' S^{-1} e_{(i)}$ is at the
bottom.

For notational convenience let us take the rewritten model to be the same
as (2.1). Now Theil's (1965) BLUS method involves choosing an $m \times m$ submatrix $X_0$ from $X$, starting
with the first row, so that $X_0^{-1}$ exists.

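Then (2.1) can be written as

$$\begin{pmatrix} Y_m \\ Y_{(n-m)} \end{pmatrix} = \begin{pmatrix} X_0 \\ X_1 \end{pmatrix} B + \begin{pmatrix} E_m \\ E_{(n-m)} \end{pmatrix}, \tag{2.3}$$

where $X_1$ holds the remaining $n-m$ rows of $X$, and $Y$ and $E$ are partitioned conformably.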


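Now make the transformation

$$U_i = Q_{11}^{1/2}\left[Y_{(n-m)i} - X_1 X_0^{-1} Y_{mi}\right], \qquad i = 1, 2, \ldots, p, \tag{2.4}$$

where $Y_{mi}$ and $Y_{(n-m)i}$ are the first $m$ and the last $n-m$ elements of $Y_i$, $Q_{11} = I - X_1(X'X)^{-1}X_1'$, and $Q_{11}^{1/2}$ is such that $Q_{11} = Q_{11}^{1/2}Q_{11}^{1/2}$. Each
$U_i$, $i = 1, 2, \ldots, p$, is an $(n-m) \times 1$ residual vector and has the property that, if
$U = (U_1 : \cdots : U_p)$, then

$$\operatorname{vec}(U) \sim N(0, \Sigma \otimes I_{n-m}).$$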

That is to say, the rows of $U$ are independently distributed as $N(0, \Sigma)$, the $p$-variate
normal distribution. Thus, we have $(n-m)$ i.i.d. observations from a $p$-variate
normal distribution with mean zero and covariance matrix $\Sigma$, and we want to
detect whether there are any outliers among them. A similar problem for observations
from $N(\mu, \Sigma)$ with unknown $\mu$ has been solved by Schwager and Margolin
(1982) and Sinha (1984).
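
A minimal numerical sketch of this reduction follows, in Python with NumPy. It assumes that the rows have already been reordered, that the first $m$ rows of $X$ form the invertible block $X_0$, and that the transformation is $U = Q_{11}^{1/2}(Y_{(n-m)} - X_1 X_0^{-1} Y_m)$ with the symmetric square root; the function name `uncorrelated_residuals` and the simulated data are hypothetical.

```python
import numpy as np

def uncorrelated_residuals(X, Y):
    """Return U = Q11^{1/2} (Y_1 - X_1 X_0^{-1} Y_0) and Q11, using the first m rows as X_0."""
    n, m = X.shape
    X0, X1 = X[:m], X[m:]
    Y0, Y1 = Y[:m], Y[m:]
    # Q11 = I - X_1 (X'X)^{-1} X_1'
    Q11 = np.eye(n - m) - X1 @ np.linalg.solve(X.T @ X, X1.T)
    # Symmetric square root of Q11 from its eigendecomposition.
    w, V = np.linalg.eigh(Q11)
    Q11_half = (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
    U = Q11_half @ (Y1 - X1 @ np.linalg.solve(X0, Y0))
    return U, Q11

# Hypothetical data: n = 12 rows, m = 3 regressors, p = 2 responses.
rng = np.random.default_rng(1)
n, m, p = 12, 3, 2
X = rng.normal(size=(n, m))
B = rng.normal(size=(m, p))
E = rng.normal(size=(n, p))
U, Q11 = uncorrelated_residuals(X, X @ B + E)
```

The difference $Y_{(n-m)} - X_1 X_0^{-1} Y_m$ does not involve $B$, and each of its columns has covariance proportional to $I + X_1(X_0'X_0)^{-1}X_1' = Q_{11}^{-1}$, so premultiplying by $Q_{11}^{1/2}$ leaves $n-m$ uncorrelated rows.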

3. FORMULATION OF THE PROBLEM AND MAIN RESULT

Let $X$ be an $n \times p$ observation matrix, such that the rows of $X$ are independent
and each row is $p$-variate normal with mean 0 and covariance matrix $\Sigma$.
The possibility of outliers with mean slippage can be incorporated by considering
the model

$$X = \Delta A \Sigma^{1/2} + Z \Sigma^{1/2}, \tag{3.1}$$


with $\Delta$ a scalar, $A = (a_{ij})$ an arbitrary $n \times p$ matrix such that some
of the rows of $A$ are zero, and $Z$ an $n \times p$ matrix of independent normal
variables with mean zero and unit variance. Unless $\Delta = 0$, the observation $X_i$ corresponding to the $i$th row
of $X$ is an outlier if the $i$th row of $A$ is nonzero.

The general outlier problem then consists of the model (3.1) and the
null hypothesis $H_0: \Delta = 0$ versus the alternative $H_1: \Delta \neq 0$. We derive a locally
optimum test of $H_0$ vs $H_1$, employing invariance arguments through the use of
a group of transformations keeping the testing problem invariant.

The above testing problem is invariant under the action of the group
$G = P \times Gl(p)$, where $P$ denotes the group of all $n \times n$ permutation matrices
with elements $\Gamma_\alpha$, and $Gl(p)$ the group of $p \times p$ nonsingular matrices with elements
$C$. The group operations are defined by (1) postmultiplication of $X$ by any
nonsingular matrix $C \in Gl(p)$ and (2) permutation of the rows of $X$ by premultiplying
$X$ by $\Gamma_\alpha \in P$. Without loss of generality assume $\Sigma = I$.

The following lemma due to Wijsman (1967) is taken from Sinha (1984).

Lemma 3.1 Let $h(x/\Delta)$ be the pdf of $x$, let $T = t(x)$ be a maximal invariant
under the transformation group $G$ and let $P_\Delta^T$ be the distribution induced by $T$ under
$\Delta$. Then the pdf of $T$ w.r.t. $P_0^T$ evaluated at $T = t(x)$ is given by

$$\frac{\int_G h(g \cdot x/\Delta)\,|C'C|^{n/2}\,d\nu(g)}{\int_G h(g \cdot x/\Delta = 0)\,|C'C|^{n/2}\,d\nu(g)}, \tag{3.2}$$

where $\nu$ is a left invariant measure on $G$. Here $g \cdot x = \Gamma_\alpha x C$, $\Gamma_\alpha \in P$, $C \in Gl(p)$ and
$\nu = \nu_1 \times \nu_2$; $\nu_1$ is the discrete uniform probability measure with mass $1/n!$ at each
of the $n!$ elements $\Gamma_\alpha \in P$ and $d\nu_2(C) = dC/|C'C|^{p/2}$.


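Lemma 3.2 The ratio in (3.2) reduces to

$$\frac{\sum_\alpha \int_{Gl(p)} \operatorname{etr}\left\{-\tfrac{1}{2}\left[C'C - 2\Delta C'S^{-1/2}(\Gamma_\alpha X)'A + \Delta^2 A'A\right]\right\}|C'C|^{(n-p)/2}\,dC}{\sum_\alpha \int_{Gl(p)} \operatorname{etr}\left(-\tfrac{1}{2}C'C\right)|C'C|^{(n-p)/2}\,dC}. \tag{3.3}$$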


The proof is easy, proceeding along the same lines as in Sinha (1984).

Now we proceed to evaluate the expression in (3.3). An exact evaluation
of the expression is not necessary to obtain the locally best invariant test.
We use a Taylor series expansion up to a few terms, evaluated at $\Delta = 0$. Making
a transformation from $C$ to $-C$, it is clear from (3.3) that the ratio of the
pdf's depends only on $\Delta^2$. Let $N_\Delta$ and $N_0$ be the numerator and the denominator
of (3.3) respectively. We assume that the conditions for taking derivatives inside
the integral signs hold. Then, using the Taylor expansion, we write

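$$N_\Delta = N_0 + N_0^{(1)}\Delta + N_0^{(2)}\frac{\Delta^2}{2!} + N_0^{(3)}\frac{\Delta^3}{3!} + N_0^{(4)}\frac{\Delta^4}{4!} + \cdots$$

$$\phantom{N_\Delta} = N_0 + N_0^{(2)}\frac{\Delta^2}{2!} + N_0^{(4)}\frac{\Delta^4}{4!} + \cdots$$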


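Using the results (Lemma 4.1) of Schwager and Margolin (1982) we can easily
show that the coefficient of $\Delta^2/2!$,

$$-\operatorname{tr}(A'A)\,N_0 + \sum_\alpha \int_{Gl(p)} \left[\operatorname{tr} C'S^{-1/2}(\Gamma_\alpha X)'A\right]^2 e^{-\frac{1}{2}\operatorname{tr} C'C}\,|C'C|^{(n-p)/2}\,dC,$$

is a constant. The coefficient of $\Delta^4/4!$, apart from a constant, is

$$\sum_\alpha \int_{Gl(p)} \left[\operatorname{tr} A C'S^{-1/2}(\Gamma_\alpha X)'\right]^4 e^{-\frac{1}{2}\operatorname{tr} C'C}\,|C'C|^{(n-p)/2}\,dC. \tag{3.4}$$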
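
As a check on these two coefficients, a short worked expansion may help. The symbols $t$, $a$ and $f$ are introduced here only for illustration and do not appear in the original: for fixed $C$ and $\Gamma_\alpha$ put $t = \operatorname{tr} C'S^{-1/2}(\Gamma_\alpha X)'A$ and $a = \operatorname{tr} A'A$, so that the $\Delta$-dependent factor of the numerator integrand in (3.3) is $f(\Delta) = \exp(\Delta t - \tfrac{1}{2}\Delta^2 a)$. Differentiating at $\Delta = 0$ gives

```latex
\begin{aligned}
f'(0)   &= t,           & f''(0)     &= t^2 - a, \\
f'''(0) &= t^3 - 3ta,   & f^{(4)}(0) &= t^4 - 6t^2 a + 3a^2 .
\end{aligned}
```

The odd-order derivatives are odd in $t$, hence odd in $C$, and integrate to zero under $C \to -C$. The term $f''(0)$ reproduces the coefficient of $\Delta^2/2!$. In $f^{(4)}(0)$ the terms $-6t^2 a$ and $3a^2$ integrate to constants once the $\Delta^2$ coefficient is known to be constant, which leaves the integral of $t^4$.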


Let $T(x) = b_{2,p} = n \sum_{i=1}^{n} (X_i' S^{-1} X_i)^2$ be the multivariate kurtosis measure defined
as in Mardia (1970). Let $L(A)$ be such that

$$n(n-1)\,L(A) = (n-2)\sum_{i=1}^{n} \|r_i\|^4 - 3\left(\sum_{i=1}^{n} \|r_i\|^2\right)^2, \tag{3.5}$$

where $\|r_i\|^2 = a_i' a_i$ and $a_i'$ is the $i$th row of $A$.
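
Both quantities are simple to compute. The sketch below is an illustration in Python with NumPy under one stated assumption: the matrix $S$ in $b_{2,p}$ is taken to be $X'X$, which makes the expression agree with Mardia's kurtosis computed about a known zero mean. The function names `kurtosis_zero_mean` and `L_of_A` are hypothetical.

```python
import numpy as np

def kurtosis_zero_mean(X):
    """b_{2,p} = n * sum_i (x_i' S^{-1} x_i)^2, assuming S = X'X (mean known to be zero)."""
    n = X.shape[0]
    S = X.T @ X
    d = np.einsum("ij,ij->i", X @ np.linalg.inv(S), X)   # x_i' S^{-1} x_i for each row
    return n * np.sum(d ** 2)

def L_of_A(A):
    """L(A) as given in (3.5); the squared row norms of A play the role of ||r_i||^2."""
    n = A.shape[0]
    r2 = np.sum(A ** 2, axis=1)
    return ((n - 2) * np.sum(r2 ** 2) - 3.0 * np.sum(r2) ** 2) / (n * (n - 1))

# Hypothetical example: 25 observations in p = 2 dimensions.
rng = np.random.default_rng(2)
X = rng.normal(size=(25, 2))
b = kurtosis_zero_mean(X)

# A slippage pattern with a single nonzero row of unit length.
A = np.zeros((10, 2))
A[0] = [1.0, 0.0]
L_single = L_of_A(A)
```

With this choice of $S$ the statistic is unchanged by row permutations and by $X \to XC$ for nonsingular $C$, matching the invariance group used above.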

Now (3.4), apart from a constant, can be written, using the results due to
Ferguson (1961), Schwager and Margolin (1982) and Sinha (1984), as

$$c_1\,T(x)\,L(A) + c_2. \tag{3.6}$$

Then  we  have  the  following  theorem. 

Theorem For the outlier problem discussed, the locally best invariant test
of $H_0: \Delta = 0$ vs $H_1: \Delta \neq 0$, conditional on $A$, is: if $L(A) > 0$, reject $H_0$ whenever
$b_{2,p} \geq k$; if $L(A) < 0$, reject $H_0$ whenever $b_{2,p} \leq k'$. The constants $k$, $k'$ are
determined by the size of the test and $L(A)$ is the function of $A$ given in
(3.5).

Proof Application of Lemma 3.2 and the generalized Neyman-Pearson Lemma along
with (3.6) completes the proof of the theorem.

One can use the asymptotic distribution of $b_{2,p}$, obtained by Mardia (1970),
to find the cutoff points $k$, $k'$. Or else, in specific problems, one can use
simulation to compute $k$, $k'$.
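
The simulation route can be sketched as follows, in Python with NumPy. This is a minimal Monte Carlo illustration, not a procedure taken from the paper: the function names, the number of replications and the choice $S = X'X$ in the statistic are assumptions. Because the statistic is invariant under $X \to XC$, its null distribution does not depend on $\Sigma$, so it suffices to simulate with $\Sigma = I$.

```python
import numpy as np

def kurtosis_zero_mean(X):
    """b_{2,p} = n * sum_i (x_i' S^{-1} x_i)^2, assuming S = X'X (mean known to be zero)."""
    n = X.shape[0]
    d = np.einsum("ij,ij->i", X @ np.linalg.inv(X.T @ X), X)
    return n * np.sum(d ** 2)

def simulate_cutoffs(n, p, alpha=0.05, reps=2000, seed=0):
    """Estimate (k', k): the lower alpha and upper 1 - alpha null quantiles of b_{2,p}."""
    rng = np.random.default_rng(seed)
    stats = np.array([kurtosis_zero_mean(rng.normal(size=(n, p)))
                      for _ in range(reps)])
    return np.quantile(stats, alpha), np.quantile(stats, 1.0 - alpha)

# Hypothetical setting: 20 uncorrelated residual rows in p = 2 dimensions.
k_lower, k_upper = simulate_cutoffs(n=20, p=2, alpha=0.05, reps=2000, seed=0)
```

Here `k_upper` would serve as $k$ when $L(A) > 0$ and `k_lower` as $k'$ when $L(A) < 0$. In the regression setting the sample size passed in would be $n - m$, the number of uncorrelated residual rows.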

Now, returning to the multivariate regression model considered in
Section 2, we test the hypothesis $\Delta = 0$ vs $\Delta \neq 0$ using the uncorrelated
residual vectors obtained in (2.4) and applying the above theorem. If the
hypothesis is rejected then we identify the observation corresponding to the
largest $e_{(i)}' S^{-1} e_{(i)}$ as an outlier. After removing the outlying observation from the
data, further testing can be done for more outliers.

We would like to remark that the kurtosis measure is very sensitive to
the presence of outliers and hence is a very useful tool for detection of
outliers. This fact, at least in the case of univariate regression models,
was realized in a data analysis problem considered by Vaidya (1985).